Abstract

We introduce and study the computational power of Oritatami, a theoretical model to explore
greedy molecular folding, in which the molecule begins to fold without waiting for the end of
its production. This model is inspired by our recent experimental work demonstrating the
construction of shapes at the nanoscale by folding an RNA molecule during its transcription from an
engineered sequence of synthetic DNA. While predicting the most likely conformation is known
to be NP-complete in other models, Oritatami sequences fold optimally in linear time. Although
our model uses only a small subset of the mechanisms known to be involved in molecular folding,
we show that it is capable of efficient universal computation, implying that any extension of this
model will have this property as well.

We introduce general design techniques for programming these molecules. Our main result
in this direction is an algorithm running in time linear in the sequence length that finds a rule for
folding the sequence deterministically into a prescribed set of shapes depending on its environment.
This shows that the corresponding problem is fixed-parameter tractable, although we proved it is
NP-complete in the number of possible environments. This algorithm was used effectively to design
several key steps of our constructions.

Keywords and phrases Molecular Folding, Computational Geometry, Self-Assembly, Tag Systems


1 Introduction 


The process by which one-dimensional sequences of nucleotides or amino-acids acquire the 
complex three-dimensional geometries of biomolecules is a major puzzle of biology today. In 
particular, the problem of predicting how proteins fold is a major source of interest, as it 
could potentially allow us to engineer our own proteins. 

A few years ago, the kinetics of folding, that is, the step-by-step dynamics of the reaction,
was demonstrated by biochemists to play a fundamental role in the final shape of
molecules [?], and an essential role in the case of RNA [5]. In recent experimental results [7],
researchers have been able to control this mechanism to engineer their own shapes out of
RNA.

One of the most widely used techniques in DNA nanotechnologies, DNA Origami,
requires the molecules to be heated up to a high temperature (about 90 °C) before being slowly
cooled down at a precisely controlled rate. In contrast to this, one of the main benefits of
RNA Origami [7] is the possibility of controlling folding at temperatures compatible with
human life.

Previous theoretical studies on folding focused mostly on energy optimization mechanisms.
For example, in different variants of the hydrophobic-hydrophilic (HP) model, it
has been shown that the problem of predicting the most likely geometry (or conformation)
of a sequence is NP-complete [?], both in two and three dimensions.

Here, we focus on kinetics, a different and complementary mechanism. We introduce a 
new model based on the experiments conducted in [7] to explore the perspectives opened by
co-transcriptional folding. In particular, in co-transcriptional folding, molecules fold in linear 
time, which allows us to focus on understanding and developing design paradigms. 

Main contributions. We introduce a new model of molecular folding where the molecule 
gets folded while being produced. More precisely, we consider a sequence of “beads”, or 
abstract basic components which may stand for nucleotides or even sequences of nucleotides 
(or domains). In our model, only the most recently produced beads of the molecule are allowed
to move in order to adopt a more favorable configuration. The folding is driven by the 
respective attraction between the beads. 

We first demonstrate as a proof-of-concept how one can design a binary counter using this 
mechanism. We then show that our model is capable of efficient universal Turing computation.
This result relies heavily on the efficient simulation of Turing machines by tag systems
from [9].

Building a tag system simulator not only shows the model to be powerful, it also pointed 
us explicitly to the challenges of molecular engineering. Namely, it led us to develop modular 
constructions and techniques to produce different shapes from a unique sequence in reaction 
to its environment. Furthermore, it taught us how one can prepare these environmental
changes to trigger calls to specific functions encoded in the sequence.

Moreover, our constructions also motivated the development of an algorithm running in 
time linear in the sequence length, that finds an attraction rule for folding a single sequence 
deterministically into a prescribed set of shapes, depending on surrounding beads. As a 
consequence, even though we will show that the problem of finding a rule is NP-complete, 
we have been able to implement and use this algorithm to resolve some parts of our designs. 

2 Model and Main Results 
2.1 Model 

Oritatami system. Oritatami is about the folding of finite sequences of beads, each from a
finite set $B$ of bead types, using an attraction rule $\heartsuit$, on the triangular lattice graph
$T = (\mathbb{Z}^2, \sim)$, where $(x, y) \sim (u, v)$ if and only if
$(u, v) \in \{(x-1, y), (x+1, y), (x, y+1), (x+1, y+1), (x-1, y-1), (x, y-1)\}$.

A conformation $c$ of a sequence $w \in B^*$ is a self-avoiding path of length $|w|$ labelled by $w$
in $T$, i.e. a path whose vertices $c_1, \dots, c_{|w|}$ are pairwise distinct and labelled by the letters of
$w$. A partial conformation of a sequence $w$ is a conformation of a prefix of $w$. For any partial
conformation $c$ of some sequence $w$, an elongation of $c$ by $k$ beads is a partial conformation
of $w$ of length $|c| + k$. We denote by $C_w$ the set of all partial conformations of $w$. We denote
by $c^{\triangleright k}$ the set of all elongations by $k$ beads of a partial conformation $c$ of a sequence $w$ and
by $c^{\triangleleft k}$ the singleton containing the prefix of length $|c| - k$ of $c$.

An Oritatami system $\mathcal{O} = (p, \heartsuit, \delta)$ is composed of (1) a (possibly infinite) primary
structure $p$, which is a sequence of beads, of a type chosen from a finite set $B$, (2) an
attraction rule, which is a symmetric relation $\heartsuit \subseteq B^2$, and (3) a parameter $\delta$ called the delay
time.

Given an attraction rule $\heartsuit$ and a conformation $c$ of a sequence $w$, we say that there
is a bond between two adjacent positions $c_i$ and $c_j$ of $c$ in $T$ if $w_i \heartsuit w_j$. The energy of a
conformation $c$ of $w$, written $E(c)$, is the negation of the number of bonds within $c$: formally,
$E(c) = -|\{(i, j) : c_i \sim c_j,\ j > i + 1,\ \text{and } w_i \heartsuit w_j\}|$.
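
The definitions above can be illustrated with a minimal Python sketch. The function names and the representation of the rule as a set of unordered pairs of bead types are assumptions made for illustration; only the six lattice offsets and the energy formula come from the text.

```python
# Offsets of the six neighbours of (x, y) in the triangular lattice T.
NEIGHBOR_OFFSETS = {(-1, 0), (1, 0), (0, 1), (1, 1), (-1, -1), (0, -1)}


def adjacent(p, q):
    """True if lattice points p and q are neighbours in T."""
    return (q[0] - p[0], q[1] - p[1]) in NEIGHBOR_OFFSETS


def is_conformation(path):
    """A conformation is a self-avoiding path: distinct, consecutive vertices adjacent."""
    return (len(set(path)) == len(path)
            and all(adjacent(a, b) for a, b in zip(path, path[1:])))


def energy(path, word, rule):
    """E(c): minus the number of bonds between non-consecutive adjacent beads.

    rule is a set of frozensets of bead types (a symmetric relation),
    an illustrative encoding of the attraction rule.
    """
    bonds = 0
    for i in range(len(path)):
        for j in range(i + 2, len(path)):  # j > i + 1
            if adjacent(path[i], path[j]) and frozenset((word[i], word[j])) in rule:
                bonds += 1
    return -bonds
```

For example, with the path (0,0), (1,0), (1,1) labelled by `"aba"` and a rule in which `a` attracts `a`, the first and third beads are adjacent through the offset (1,1) and form the only bond.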


Oritatami dynamics. A dynamics for a sequence $w$ is a function $D_w : 2^{C_w} \to 2^{C_w}$ such that
for every subset $S$ of partial conformations of length $i$ of $w$, $D_w(S)$ is a subset of the elongations
by one bead of the partial conformations in $S$ (thus, partial conformations of length $i + 1$).
Given an Oritatami system $\mathcal{O} = (p, \heartsuit, \delta)$ and a seed conformation $\sigma$ of a seed sequence
$s$ of length $\ell$, the set of partial conformations of the primary structure $p$ at time $t$ under
dynamics $D$ is $D^t_{sp}(\{\sigma\})$,¹ i.e. the set of all elongations by $t$ beads of the seed conformation
prolonged by the primary structure according to dynamics $D$.

We explore greedy folding dynamics where only the most recently transcribed beads can
move; all other beads remain in place. These are controlled by an integer parameter $\delta$ (in this
article, $\delta \le 4$). We define two main dynamics:

Oblivious dynamics consists in placing the last $\delta$ beads in the minimal energy positions,
regardless of their previously adopted positions.²

The resulting conformations are nondeterministic, and the nondeterministic position of
the $i$-th bead of $p$ is final at time $i + \delta$.

Hasty dynamics does not question previous choices but chooses the energy-minimal positions
for the last $\delta$ beads among all elongations of the previously adopted partial conformations.
It leaves the $\delta - 1$ most recently placed beads where they are and abandons the extension
of a conformation if no extension with the newly transcribed bead reaches a
lowest energy conformation available for the last $\delta$ beads. Formally, $\mathcal{H}$ starts from a set
of partial conformations, elongates each of them by one bead, and keeps the elongated
conformations that have minimal energy among those that share the same prefix of length
$|\sigma| + t - \delta$.



Note that both dynamics may select conformations of different energy levels, as geometric
constraints may differ from one conformation to another and lead to different minimization
landscapes.
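
As an illustration, one step of the hasty dynamics as described in words above (elongate by one bead, then keep the minimal-energy elongations within each class sharing the same prefix) might be sketched as follows. This is a hedged reading of the prose, not the authors' formal definition: the names `hasty_step` and `elongations`, the tuple representation of conformations, and the encoding of the rule as a set of unordered pairs are all assumptions.

```python
from collections import defaultdict

# Offsets of the six neighbours in the triangular lattice.
OFFSETS = [(-1, 0), (1, 0), (0, 1), (1, 1), (-1, -1), (0, -1)]


def energy(path, word, rule):
    """Minus the number of bonds between adjacent, non-consecutive beads."""
    index_at = {pos: i for i, pos in enumerate(path)}
    bonds = 0
    for i, (x, y) in enumerate(path):
        for dx, dy in OFFSETS:
            j = index_at.get((x + dx, y + dy))
            if j is not None and j > i + 1 and frozenset((word[i], word[j])) in rule:
                bonds += 1
    return -bonds


def elongations(path):
    """All self-avoiding elongations of a conformation (tuple of points) by one bead."""
    x, y = path[-1]
    taken = set(path)
    return [path + ((x + dx, y + dy),) for dx, dy in OFFSETS
            if (x + dx, y + dy) not in taken]


def hasty_step(confs, word, rule, delta):
    """One step: elongate every conformation, then keep, within each class of
    elongations sharing the same prefix (all but the last delta beads),
    only those of minimal energy. word is the seed sequence followed by
    the primary structure."""
    groups = defaultdict(list)
    for c in confs:
        for e in elongations(c):
            groups[e[:max(0, len(e) - delta)]].append(e)
    kept = set()
    for group in groups.values():
        best = min(energy(e, word, rule) for e in group)
        kept.update(e for e in group if energy(e, word, rule) == best)
    return kept
```

With seed (0,0), (1,0), word `"aba"`, a rule where `a` attracts `a`, and delay 1, the step keeps exactly the two elongations whose third bead touches the first one; this also shows how several conformations can survive, which is why determinism has to be required separately.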

An Oritatami system $\mathcal{O} = (p, \heartsuit, \delta)$ is deterministic for dynamics $D$ and seed $\sigma$ of sequence
$s$ if for all $i \ge 1$, the position of the $i$-th bead of $p$ is deterministic at time $i - 1 + \delta$, i.e.
if for all $i \ge 1$, $|\{c_{|\sigma|+i} : c \in D^{i-1+\delta}_{sp}(\{\sigma\})\}| = 1$. We say that $\mathcal{O}$ stops at time $t$ with seed
$\sigma$ and dynamics $D$ if $D^t_{sp}(\{\sigma\}) = \emptyset$ and $D^i_{sp}(\{\sigma\}) \neq \emptyset$ for $i < t$. Typically, the folding
process stops because of geometric obstruction (no more elongations are possible because the
conformation gets trapped in a closed area).


¹ Given two words $a, b \in B^*$, we denote by $ab$ their concatenation.

² We denote by $\arg\min_{x \in X} f(x)$ the set of the minima: $\arg\min_{x \in X} f(x) = \{y \in X : f(y) = \min_{x \in X} f(x)\}$.

2.2 Main Results

Turing universality. Our first main result shows that there is a Turing-universal Oritatami 
system, able to simulate the execution of any Turing machine with only a polynomial 
slowdown. 

► Theorem 1. There is an oblivious deterministic Oritatami system $\mathcal{U} = (p, \heartsuit, 3)$ and a
log-space reduction from any Turing machine $M$ and any input $x$ to a seed configuration
$\sigma_{M,x}$, such that starting from seed conformation $\sigma_{M,x}$, $\mathcal{U}$ stops if and only if $M$ accepts $x$.

Moreover, if $M$ halts after $T$ steps on input $x$, $\mathcal{U}$ halts after folding $O(T^2 \log T)$ beads.

In particular, the total number of bead types as well as the period of p in U are bounded 
by a universal constant. 

Rule design. Our second main result concerns the design of a rule for achieving a set of 
given foldings depending on the environment. 

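Input: A set of beads $B \supseteq \{1, \dots, n\}$, a delay time $\delta$, $k$ seed conformations $\sigma_1, \dots, \sigma_k$ of
sequences $s_1, \dots, s_k \in B^*$ (with possibly different lengths) and $k$ target conformations
$c_1, \dots, c_k$ of the $n - \delta$ first beads of the sequence $p = (1, \dots, n)$, and a dynamics $D$.

Output: A rule $\heartsuit \subseteq B^2$ such that for all $i = 1..k$, the Oritatami system $\mathcal{O} = (p, \heartsuit, \delta)$ folds
the $n - \delta$ first beads of $p$ deterministically into $\sigma_i c_i$ from seed conformation $\sigma_i$ with
dynamics $D$, i.e. such that the conformations obtained from $\sigma_i$ under $D$, restricted to the
$n - \delta$ first beads of $p$, form the singleton $\{\sigma_i c_i\}$ for all $i = 1..k$.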

► Theorem 2. The Rule design problem is NP-complete for all $\delta \ge 1$ and $n \ge 1$.

However, it is FPT as it can be solved in time $O(C^{\delta \cdot k} n)$ for some $C > 0$, linear in the
length $n$ of the primary sequence.

3 Proof of concept: Folding a binary counter 

The goal of this section is to prove the following theorem: 

► Theorem 3. There is a 60-periodic primary sequence $s$ such that for any integer $n$, there
is an encoding of $n$ into a seed $\sigma$ of width $O(\log_2(n))$, such that $s$ folds into a structure
encoding successive increments of $n$, using the hasty dynamics.

More precisely, if $n$ is initially encoded on $b$ bits, and the seed is on row 0, then for all
$i < 2^b - n$, row $6i$ contains the binary encoding of $n + i$.

3.1 Base mechanism 

The basic idea of this construction is to use rectangular areas of the plane (in fact, slanted
rectangles), called perimeters, to perform each operation. We encode a carry and the bits of
the current value of the counter in the following way: the carry is encoded by the input
position of the first beads in the perimeter, and the bits are encoded by bead types around
the perimeter.

Our construction progresses downwards and in zig-zags: every zig pass (right to left pass) 
computes the next encoding, and every zag (left to right) pass copies the value to its bottom 
row, and goes back to the starting point to begin the next round. 

The rule is chosen so that the shape of the primary structure in each perimeter encodes 
the result of the local computation on its bottom row (in a zig pass, the local computation 
means propagating a carry, and in a zag pass, copying the previous value). 
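
Abstracting away the geometry, the local computation of a zig pass is that of a chain of half adders. The following Python sketch shows only this logical behaviour, not the folding itself; the function names, the most-significant-bit-first list encoding, and the assumption that each zig pass enters the rightmost perimeter with a carry of 1 are illustrative choices, not taken from the construction.

```python
def half_adder(bit, carry):
    """Local computation of one perimeter in a zig pass: (output bit, next carry)."""
    return bit ^ carry, bit & carry


def zig_pass(bits, carry=1):
    """Right-to-left pass over the bits (most significant bit first in the list).

    Assumes the pass enters the rightmost perimeter with carry 1,
    i.e. the pass increments the counter.
    """
    out = []
    for bit in reversed(bits):
        new_bit, carry = half_adder(bit, carry)
        out.append(new_bit)
    return out[::-1], carry


def zag_pass(bits):
    """Left-to-right pass: copies the value unchanged to the bottom row."""
    return list(bits)
```

For instance, `zig_pass([0, 1, 1])` turns 3 into 4: the carry propagates through the two low-order 1s and is absorbed by the 0.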

Figures 1 and 2 show the reading of 1 with a carry of 1, and the reading of 0 with a carry
of 1, respectively. In both figures, the folding starts on the bottom row of the perimeter,
encoding a carry of 1. In Figure 1, at the end of the process, the bottom row contains the
encoding of 0, and the last bead is on the bottom row, which encodes the next value of the
carry (1). In Figure 2, the bottom row contains an encoding of 1, and the last bead is on
the top row, which encodes the next value of the carry (0).



Figure 1 The first 12 steps in the “half adder” perimeter: the carry is 1 (encoded by the position 
of the first bead on the bottom row), and the primary structure “reads” a 0, and outputs a 1. 




Figure 2 The first 12 steps in the “half adder” perimeter: the carry is 1 (encoded by the position 
of the first bead on the bottom row), and the primary structure “reads” a 1, and outputs a 0. 


3.2 The global construction: modules and functions 

Figure 3 shows three successive iterations of the counter: starting from the seed in orange,
it first does one zig pass (right-to-left), and then proceeds in zig-zags. Each zig pass uses 
three consecutive rows of the grid, and each zag pass also uses three consecutive rows. In 
this section, we explain how to design and analyze such a system. 

Modules and functions. Since our primary structure is periodic of a fixed period for any
bit width, we have to use the same parts for the zig and the zag passes, although they are 
built in a different direction. This means that the primary structure is cut into a number of 
parts called modules , each module having different functions. Formally, a module is a factor 
(contiguous subsequence) of the primary structure, and a function of a module is given by 
(1) a perimeter, (2) the position of the first bead in the perimeter, (3) the beads surrounding 
the perimeter and (4) the conformation of the primary structure restricted to the perimeter. 

Modules in our construction. In this construction, each period (60 beads) has four modules:
the first one (12 beads) is a half-adder module, the second one (18 beads) is used for U-turns, 
the third one (12 beads) is another half-adder module, and the fourth one (18 beads) is used 
for U-turns. 

Encoding of the current value. The value of the counter is encoded in binary at the
beginning of every zig phase: reading the bottom row of the whole conformation, the first
three beads encode a signal to start a U-turn, and then two kinds of words alternate: bit
encodings, on 4 beads, and “silent” sequences of 6 beads, encoding nothing. The alternation
starts and ends with a bit encoding.

Figure 3 Three iterations of the counter. The two remaining modules (in red; the first
module has bead numbers 12,..., 29 and the second one has bead numbers 42,..., 59) have four
different conformations each: two in the middle of right-to-left passes (in the carry and non-carry
case), one on the sides (on the left-hand side for the first module, and on the right-hand side for the
second module), and one on the left-to-right pass.

Functions. We list the functions for each of the four modules:

■ Half-adder modules have six functions: during the zig passes, each of the two half-adder
modules can read two possible values, and start at two possible positions. During the zag
passes, each of the two half-adder modules can read two possible values, and only starts
on the bottom row.

■ U-turn modules have four functions: during the zig passes, each of the two U-turn
modules propagates the carry between half-adder modules, hence U-turn modules have
two functions, one for each value of the carry. During the zag passes, the U-turn modules
always start and end on the bottom row. Therefore, they have one function. Moreover,
each U-turn module has another function, which is assembling a U-turn on the
left-hand side and on the right-hand side of the configuration, to alternate between zig
and zag passes.

Now, our construction assumes that the initial integer is encoded on an odd number of
bits. Therefore, each of the U-turn modules is either always on the left-hand side, or
always on the right-hand side of the conformation.

The full rule is shown in Appendix C.

4 A Turing-universal Oritatami system

In this section, we demonstrate the existence of a single periodic primary structure that can
simulate any Turing computation. Precisely, our construction simulates a particular type of
tag systems which are known to simulate in $O(T^2 \log T)$ steps any Turing machine running in
$T$ steps [9]. Our simulation uses the oblivious dynamics with delay time 3.

Skipping Cyclic Tag systems. A skipping cyclic tag system consists of a set of $n$ productions
$p_0, \dots, p_{n-1} \in \{0,1\}^*$ and an initial word $w^0 \in \{0,1\}^*$. At each time step, the tag system
cycles through the productions and decides to append the current production or not depending
on the letter read. We denote by $w^t$ the word at time $t$. Precisely, at time $t = 0$, the pointer
$q^0$ is set to 0. At all times $t$,

■ If $w^t$ is the empty word $\varepsilon$, then the tag system halts and outputs $q^t$.
■ Otherwise, if the first letter $w^t_1$ of $w^t$ is 0, then set $q^{t+1} := (q^t + 1) \bmod n$ and
$w^{t+1} := w^t_2 \dots w^t_{|w^t|}$, the suffix of $w^t$ without its first letter.
■ And if $w^t_1 = 1$, then the tag system appends the next production to $w^t$ and skips to
the following production, i.e.: $w^{t+1} := w^t_2 \dots w^t_{|w^t|}\, p_{q'}$ where $q' = (q^t + 1) \bmod n$ and
$q^{t+1} := (q^t + 2) \bmod n$.

For instance, the skipping tag system $(\varepsilon, 100, 1, 0)$ has the following execution $((w^t, p_{q^t}))_t$ from
input word $w^0 = 010$: $(010, \varepsilon)$, $(10, 100)$, $(01, 0)$, $(1, \varepsilon)$, $(100, 1)$, $(000, \varepsilon)$, $(00, 100)$, $(0, 1)$, $(\varepsilon, 1)$
and thus outputs 1. This example is reproduced in a more readable setting in Appendix D.
The following assumption will ease our design: we can assume that $n$ is a multiple of
4 by doubling 0s in the productions and adding empty word productions (folklore). The
rest of the section describes how to simulate any skipping cyclic tag system with
$n \equiv 0 \bmod 4$ productions.
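
The three rules above translate directly into a short simulator. The following Python sketch is illustrative only: the function name, the string encoding of words and productions, and the `max_steps` guard are assumptions made for this example.

```python
def run_skipping_cyclic_tag(productions, word, max_steps=10_000):
    """Run a skipping cyclic tag system until the word is empty.

    productions: list of strings over {'0', '1'} (p_0, ..., p_{n-1}).
    Returns (final pointer q, trace of (word, pointer) pairs).
    """
    n = len(productions)
    q = 0  # the pointer q^0 is set to 0
    trace = [(word, q)]
    for _ in range(max_steps):
        if word == "":
            return q, trace  # halt on the empty word
        if word[0] == "0":
            # Drop the first letter and move to the next production.
            word = word[1:]
            q = (q + 1) % n
        else:
            # Drop the first letter, append the next production, skip one.
            word = word[1:] + productions[(q + 1) % n]
            q = (q + 2) % n
        trace.append((word, q))
    raise RuntimeError("no halt within max_steps")
```

Called as `run_skipping_cyclic_tag(["", "100", "1", "0"], "010")`, the sketch goes through the words 010, 10, 01, 1, 100, 000, 00, 0 paired with the productions ε, 100, 0, ε, 1, ε, 100, 1, and then reaches the empty word.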

4.1 Principle of the design 

Figure 4 presents the global design for our simulation on the example of the skipping tag
system $(\varepsilon, 100, 1, 0)$ with the same input word 010 as above. As for the counter before, the
simulation proceeds in forward-backward swipes of the encoding of the current word. Each
forward (left-to-right) swipe trims all the initial 0s from the beginning of the word until a 1
is met, then rushes to the end of the word to append the corresponding production. The
following backward (right-to-left) swipe rewinds to the position in the word just after its first
1 while copying its letters down below for the reading of the next swipe. The construction
continues until it runs out of letters, in which case the folding gets trapped in a finite
space and halts.

Production encoding. Each production of the tag system is encoded in the molecule as a
module, all of equal length. Each production module is composed of the exact same elements;
only the encoding of the letters inside the module changes from one production to another (see
Fig. 5). Precisely, if $L = \max_i |p_i|$ denotes the maximum length of a production, the
production module for $p_i$ is the sequence of submodules $(A, B, C, (D_a)_{a = (p_i)_j,\ j = 1..|p_i|}, E_k, F, G)$
with $k = L - |p_i|$.

Module A: Init is a simple module building a simple scaffold for the following modules; it
always folds in the same way.

Module B: Empty word probe is a very short module that is sensitive to the presence of
a non-empty word above it; if the word is empty, then it folds to the left, blocking the
molecule into a finite space, thus halting the co-transcriptional folding and simulating
the halt of the tag system. Otherwise, it folds to the right and the folding continues.

Module C: End of word probe is sensitive to the end of the word; if the end of word is
reached, it folds in a way that initiates the appending of the letters of the production
module; otherwise, it initiates the compact folding of the production module.

Modules $D_0$ and $D_1$: Letters encode the letters of the production; they can fold into two
main forms: compact, where the letters are hidden from the reading head in Module G;
or expanded, when the letters are appended at the end of the word.

Module $E_k$: Padding & Carriage return has two purposes: first, ensure that all production
modules have the same length by padding each production $p_i$ with $k = L - |p_i|$ spaces;
second, reverse the direction of the folding to
accomplish a "carriage return of the molecule" once the current production letters (in
expanded form) have been appended to the word, marking the end of the forward swipe.

Module F: Term, as for Module A, is used to build a scaffold along which the next module
folds.

Module G: Read, Copy & Line Feed is the real "brain" of the molecule; in the forward
swipe, it first reacts to the letters of the word by folding so as to skip the initial 0s until
it finds a 1, which has the effect of mirroring the following production modules; when the
production modules are mirrored, G folds in a way that copies the letters read above
down below; then, at the end of the backward swipe, when it reaches the beginning of
the first letter of the current word, G spontaneously folds to extend further down
below, starting a new line for the next forward swipe to begin.

Figure 5 The production module (folded upright) corresponding to a production 10 in a tag system
where all productions have length at most 3 (hence, Padding submodule $E_k$ takes parameter $k = 1$).


The production block automaton. Fig. 7 shows the canvas underlying our design. Our
construction is best understood in terms of production blocks. A production block consists of
the folding of either a single production module or of a series of $n$ consecutive production
modules, i.e. the union of the bricks of one or $n$ consecutive production modules. The states
of the automaton in Fig. 7 are production blocks. The leading production module (in yellow
and indexed by $q$ in Fig. 7) in each production block corresponds to the current production
of the tag system at that moment in the simulation.
Each production block plugs into the previous one at the red >s. The automaton places
the blocks one after the other in the plane. Starting from the seed conformation in brown
and following this automaton, one will retrieve the design in Fig. 4, as shown in Fig. 9. The
disjunctions in the automaton are based on the occupation of specific locations: in the real
Oritatami system, the presence of beads at these locations results in different foldings, which
are materialized here by a disjunction between different production blocks. The letters of the
simulated word are encoded by a blue bump for each letter 0 and a flat surface for each 1,
and can be read on every horizontal line above or below the blocks in Fig. 4 and 9.

► Lemma 4. The production block automaton simulates faithfully any skipping cyclic tag 
system. 

Proof sketch. The automaton behaves as expected: it first trims the leading 0s of the word,
passing the lead to the next production each time, until it finds a 1; then it passes the lead
to the next production, copies the remaining letters, appends the current production
letters to the word, passes the lead to the next production and rewinds (while copying the
letters from right to left) to the position just after the 1 last read. If it ever runs out of
letters, the automaton halts (see Fig. 9). ◄

Designing the modules. The remainder of the section explains how to design
the modules A, ..., G so that the resulting Oritatami system folds as indicated by the
production block automaton.

Due to space constraints, we will not provide the full proofs of the correctness of the
folding but only focus on specific issues that required specific and potentially inspiring tools
to be resolved. We refer the reader to the videos available at [6] for a full demonstration of
the resulting Oritatami system folding its modules live upon itself. The full description of
the modules and rule is given in the appendix. Table 1 summarizes the different conformations
that each module adopts in the various stages of the simulation.

4.2 The design of the modules 

Recall that our simulation uses delay time 3, that is to say, only the last three beads produced
are looking for their best locations. Our design is deterministic: the positions of the beads
older than 4 time steps are fixed and unambiguous.

First, remark that the production modules adopt three main conformations in the
simulation (Fig. 4 and 7):

Upright: the "pointy ears" of the production modules point northeastwards; these
conformations appear during the reading forward phase.

Mirrored: the "pointy ears" of the production modules point southeastwards; these
conformations appear during the copying forward phase.

Rotated: the "pointy ears" of the production modules point southwestwards; these
conformations appear during the copying backward phase.

Note that these conformations are related by symmetries (vertical mirroring, or rotating by 
180°) that preserve the neighborhood relationships in the triangular grid. It follows that 
each module will fold identically whatever the orientation of the production module is (NE, 
SE or SW) as long as it is immersed in the same environment. In the following, we will 
thus present the foldings of each module in only one of these three main conformations. The 
results will hold in the others as well by symmetry. 
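
The claim that these symmetries preserve neighborhood relationships can be checked mechanically on the six neighbor offsets of the lattice. In the sketch below, the formula used for the vertical mirroring, $(x, y) \mapsto (x - y, -y)$, is an assumption: it is the reflection across the horizontal axis when the lattice is drawn with $(1,0)$ horizontal and $(0,1)$ at 120° from it. The function names are illustrative.

```python
# Offsets of the six neighbours in the triangular lattice.
OFFSETS = {(-1, 0), (1, 0), (0, 1), (1, 1), (-1, -1), (0, -1)}


def rotate_180(p):
    """Rotation by 180 degrees around the origin."""
    x, y = p
    return (-x, -y)


def mirror(p):
    """Reflection across the horizontal axis (assumed embedding, see lead-in)."""
    x, y = p
    return (x - y, -y)


def preserves_adjacency(f):
    """For a linear map f, adjacency is preserved iff the offset set maps onto itself."""
    return {f(d) for d in OFFSETS} == OFFSETS


def naive_flip(p):
    """Negating y in lattice coordinates: NOT a symmetry of the lattice."""
    x, y = p
    return (x, -y)
```

Both `rotate_180` and `mirror` pass the check, whereas `naive_flip` fails because it sends the offset (1,1) to (1,-1), which is not a neighbor offset. Under the assumed embedding, `mirror` sends (1,1) to (0,-1) and `rotate_180` sends it to (-1,-1).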

Basic scaffolding: Modules A and F. Our construction uses rigid scaffoldings named
gliders, see Fig. 6. Gliders are rigid (they support themselves) and require only few bonds
(one every 3 positions on average). It is easy to check that gliders fold as expected and require
only 6 different beads, corresponding to a period of the glider pattern. A and F use gliders
to build a rigid scaffold, described in Fig. 12 and 22, along which the following modules will
fold.

Adopting either a compact or expanded form: Modules $D_a$, $E_k$, and G. Our
design requires storing the letters of the production in a compact form inside
the production module and expanding them into a glider when appending the
letters at the end of the word. The compact form is called switchback. Remarking that the
pointed ends of the switchback are similar to the gliders, we have obtained a bonding scheme
compatible with both switchback and glider, as shown in Fig. 6. The magic resides in the fact
that the form is controlled by the placement of the first three beads: if they adopt a glider form,
the rest of the molecule will fold into a glider; if they adopt the switchback form, then the
rest of the molecule does as well. This allows us to have the modules $D_0$, $D_1$, $E_k$, and G contract
or expand at will by forcing the placement of the first three beads with strong bonding to the
environment. Note that each of the switchback strands can be extended as much as wanted
by repeating the same 12 beads; this allows constructing switchbacks compatible with gliders
of arbitrary height (as long as it is a multiple of 12).

Detecting ends: B, C, and G. End detection is obtained by realizing various levels of
attachment of a given module: by default it will fold in a certain way, but presented with
some specific environment, it will bind strongly with it and change its shape. We refer to the 
appendix and the folding of [§J for more details on the process. 

Implementing various functions: G. G is a very sophisticated structure that needs to
implement many different functions: reading, copying, and line feeding. It is also responsible
for the major changes in the geometry of the folding by reversing the production modules.
"Calling" the different functions is achieved by shifting the module along its environment.
Precisely, on the one hand, in the upright conformation of a production module, the area
below the production module is cleared and G will fold its first 8 beads below, shifting its
position relative to the preceding module F. The effect is striking: G will fold as a
glider and enter its "reading" mode. On the other hand, in the mirrored and rotated
conformations, the area above the production module is occupied and G naturally folds
along it, adopting its switchback shape and activating its "copying" mode.

Note that we had to extend this shifting paradigm further to separate functions
that would naturally take place at the same beads of the module, which would make the design
infeasible. We proceed by introducing delays in the glider mode using what we call chaussettes:
we let the end of the switchback strands fold for a while outside the glider in the glider mode, as shown
in red in Fig. 23, 24 and 27 in the appendix. This allows us to separate these critical parts,
which are dedicated to the copy of the letters (see Fig. 25 and 26), from the reading and line feeding
modules, which are moved 72 beads away from them.

A last but not least challenge is to ensure, with a small constant number of beads,
that the switchback form of G is "glued" along F while permitting the glider form of
G to grow without bonding to F anywhere. The difficulty resides in the fact that the
glider is three times slower than the switchback, which voids any approach based on affine
shifting. We solved this issue by coloring the beads in the modules F and G using a
logarithmic scheme: the $i$-th beads in F and G receive essentially (up to some shift) the color
$(\lfloor \log_3 i \rfloor \bmod 4,\ i \bmod 12) \in [4] \times [12]$, and binding only beads with identical color does the
trick. This coloring scheme is superimposed over the glider/switchback bead scheme.
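
The coloring formula can be computed with integer arithmetic only. The sketch below ignores the shift mentioned in the text ("up to some shift"), which is not specified here; the function names and the 1-based bead index are assumptions.

```python
def floor_log3(i):
    """floor(log_3(i)) for an integer i >= 1, computed without floating point."""
    if i < 1:
        raise ValueError("bead index must be at least 1")
    k = 0
    while i >= 3:
        i //= 3
        k += 1
    return k


def color(i):
    """Color (floor(log_3 i) mod 4, i mod 12) of the i-th bead, shift ignored."""
    return (floor_log3(i) % 4, i % 12)
```

The first component changes only when the index crosses a power of 3, which is how the scheme absorbs the factor of three between the speeds of the glider and of the switchback; the second component follows the 12-bead period of the strands.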

Full details on this construction can be found in the appendix. Figure 8 shows a picture of
our Oritatami molecule folded so as to simulate a skipping cyclic tag system.


5 Rule design: hard but feasible 

An important problem related to our two main constructions (Sections 3 and 4) is the
problem of finding an attraction rule such that a primary structure folds into its correct 
functions. This section introduces an algorithmic approach to this problem, called the rule 
design problem , and specified as follows: 


Input:

a delay time $\delta$, a list of $n > 0$ seeds $\sigma_1, \sigma_2, \dots, \sigma_n$, and a list of $n$
conformations $c_1, c_2, \dots, c_n$ of the same length $l$

Output:

an attraction rule $\heartsuit$ such that for all $i \in \{1, 2, \dots, n\}$, Oritatami system
$\mathcal{O}_i = (s, \sigma_i, \heartsuit, \delta)$ deterministically folds into conformation $c_i$, where $s$ is
the sequence of length $l$ such that for all $i \in \{1, 2, \dots, l\}$, $s_i = i$.


The two following lemmas yield Theorem 2 (proofs are omitted and may be found in
the appendix).

► Lemma 5. For any positive delay time, the rule design problem is NP-complete. 

► Lemma 6. The rule design problem with $n$ target conformations, each of length $l$, and
delay time $\delta$ is NP-complete but fixed-parameter tractable, as it can be solved in time and
space complexity $l \cdot 2^{5 n \delta + 6 \sum_k |\sigma_k|}$.

A Omitted figures for the tag system

Figure 7 The production block design automaton.

Figure 9 Simulation by the production block automaton of the same skipping tag system as in Fig.K

[Figure panel labels: Forward swipe; Read the first letters of the word.]

■ Table 1 The various foldings of each submodule in a production module in the various stages of
the simulation. The little dents in each module indicate the locations of the beginning and the
end of its folding.


B Perspectives


The purpose of our new model is not to be entirely accurate with respect to phenomena 
observed in nature, but instead to start developing an intuition about the kind of problems 
that need to be solved in order to engineer RNA shapes, and later, even proteins. This 
approach can be compared to learning programming in a high-level programming language 
before learning assembly code. 

For instance, our Turing machine simulation shows how one can exploit the
fact that small shifts in the sequence can expose different functions of the same part of a
molecule. This could be a new pattern to look for in conformation databases.

In the future, a number of extensions of this model seem natural. In particular, it could be
extended with a more realistic notion of thermodynamics and molecular agitation. Using existing
work in molecular dynamics would allow us to explore a stochastic optimization process
instead of a deterministic one. Moreover, this would also allow us to study the reconfiguration of a
conformation.


C Proof of correctness for the counter 

Proving the correctness of this construction means checking that each function is folded
correctly. The full rule is shown in Figure 10. Since each half-adder module has six functions,
each U-turn module has four, and there are two instances of each of them, there are in total
$2 \times 6 + 2 \times 4 = 20$ functions to prove. Since the dynamics considers many cases for each
function, we can simply run a computer program to check that all 20 functions are folded correctly.

Note that modules are not exactly independent: the last 5 beads of each module depend 
on attractions between beads of the next module. However, since the order in which modules 
appear in the primary structure is always the same, it is sufficient to run a simulation of the 
dynamics for the periodic structure, until all functions have occurred, which happens when 
counting up to 8 with 3 bits. 



Figure 10 A representation of the full rule for the counter. Bead types are written on the sides
of the diagram. From each bead type, zero, one or two lines can start. For any pair $(b_0, b_1)$ of bead
types, $b_0 \heartsuit b_1$ if and only if the line starting from $b_0$ intersects the line starting from $b_1$ with a black
circle. Note that this representation is symmetric.

Figure 11 The tree of all possible conformations for the function of the first half-adder and first U-turn modules that reads a 0 with carry 1, and folds the
U-turn into vertical zig-zags of height 3. At each step, the conformations of minimal energy are drawn in bold lines, and the selected one is the one with a child
in bold line.


D Example of skipping tag system


E Full description of the SCTS simulation Oritatami design 

This section gathers the full description of the attraction rule for each module with their 
environment. 

The construction is defined by two parameters: $w = 6(L + 9) + 18$, the width of a
production module minus the width of its last module, G (in its compact conformation),
and $h = n(w + 6) - (w + 3)$, the height of the production modules. All the modules are
described with respect to these two parameters. The following relations will be of highest
importance in the design of the rule. Recall that we assumed $n \equiv 0 \pmod 4$ (duplicating 0s in
the productions and adding empty productions allows this assumption). This implies that:

$w \equiv 0 \pmod 6$,
$n(w + 6) \equiv 0 \pmod{24}$,
and $h \equiv 3 \pmod 6$.
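These congruences follow from $w = 6(L + 12)$ and $n(w + 6) = 6n(L + 13)$ with $n$ a multiple of 4, so that $h \equiv 0 - 0 - 3 \equiv 3 \pmod 6$. The short sketch below checks them numerically over a small range; the function name and the tested ranges of $L$ and $n$ are arbitrary choices made for this illustration.

```python
def production_parameters(L: int, n: int) -> tuple[int, int]:
    """Return (w, h) as defined in the text, assuming n is a multiple of 4."""
    if n <= 0 or n % 4 != 0:
        raise ValueError("the construction assumes n = 0 mod 4")
    w = 6 * (L + 9) + 18
    h = n * (w + 6) - (w + 3)
    return w, h


# Exhaustive check on a small, arbitrarily chosen range of parameters.
for L in range(0, 50):
    for n in range(4, 41, 4):
        w, h = production_parameters(L, n)
        assert w % 6 == 0
        assert (n * (w + 6)) % 24 == 0
        assert h % 6 == 3
```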


Module A is a very simple glider structure with very limited interactions with its environment.
It is $3h - 2$ beads long, 3 beads wide and $h$ beads high. Its standalone glider structure
requires only 5 beads for the first five beads, then 6 beads for the glider pattern repeated
$\frac{(3h-2)-5-2}{6} \in \mathbb{N}$ times, plus 2 beads to conclude the construction. Its only interactions with
its environment are between its first few beads and the last beads of B. Fig. 12 provides its
full description.


Module B is a very simple yet carefully designed module: it spontaneously folds
to the right, will fold to the left in the presence of F alone, but will fold to the right if E lies
next to F (because it gets attracted by E, see Fig. 13), which means that the current word
is not empty. This ensures that the folding ends (B folds to the left) if and only if
the current word is empty. It is 5 beads long, and its non-halting folding is 3 beads wide and
3 beads high (see Fig. 13).


Module C is fairly simple. It naturally folds into three switchbacks along A (Fig. 14),
but, when attracted by its environment, it climbs higher and folds into two switchbacks
(Fig. 15), which triggers the appending of the current production to the word. It is $3h - 10$
beads long. One can check that $(3h - 9)/2 \in \mathbb{N}$. Each of the switchbacks follows a periodic
pattern and 6 beads are enough for each of them. It follows that 24 beads are enough for its
design.

Figure 12 Module A: Init. Length = $3h - 2$.

Figure 13 Module B: Empty word probe (length 5; panels "Non-empty word detected" and "Empty word
detected"). It detects the empty word and folds to the left if and only if F is present but not E.

Figure 14 Module C: End of word probe in compact switchback folding. Length = $3h - 10$,
height = $h - 3$, width = 3.


Modules D0 and D1 adopt the Glider/Switchback structure described earlier with 6
switchbacks of length $n(w + 6)/2 \equiv 0 \pmod{12}$ each, thus a total length of $3n(w + 6)$. The
switchback pattern can be repeated with period 4. It follows that for the main switchback structure
$4 \times 12$ beads are enough. Now, for the bump in D0, 8 extra beads are needed (D0 and D1
are otherwise identical). Furthermore, in order to avoid unwanted interactions between the
expander glider letters, we equip the bottom of each switchback with boots of length 16 (in
red in Fig. 19), which requires $4 \times 16$ more beads. Thus a total of $48 + 8 + 64 = 120$ beads.


Module E_k is basically the same as D with $6(L - k + 9)$ switchbacks of length $n(w + 6)/2$ each,
followed by a long glider of length $3h - 1$ in the middle and 5 long switchbacks of length $h$
each. Its total length is thus $t = 3n(L - k + 9)(w + 6) + 8h - 1 \equiv 23 \pmod{24}$. It must however
be able to fold spontaneously upon itself in glider mode around $3c \equiv 0 \pmod{18}$, where
$c = (t + 1)/4 \equiv 0 \pmod 6$. Note that for our choice of parameters $3c < 3n(L - k + 9)(w + 6)$,
and thus the fold-back turn appears, as shown in Fig. 20, inside the short switchbacks. It is
thus enough to use $2 \times (4 \times 12 + 4 \times 16)$ beads for the two phases of the short switchbacks
(before and after the fold-back point) plus 18 beads for the long glider and $5 \times 6$ beads for the
long switchbacks (6 beads are enough since they are not required to fold into a glider). Thus,
a total of 272 different beads are enough (this number can be reduced considerably by more
careful adjustments).
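The length and bead-count arithmetic for this module can be checked mechanically. The sketch below is only a sanity check under stated assumptions: the function name is made up for this example, and the parameter ranges ($0 \le k \le L$, $n$ a multiple of 4) are chosen arbitrarily, since the exact admissible ranges are fixed elsewhere in the construction.

```python
def module_e_length(L: int, k: int, n: int) -> int:
    """Total length of module E_k as the sum of its three parts.

    Assumes n is a multiple of 4 and 0 <= k <= L (an assumption of this sketch).
    """
    w = 6 * (L + 9) + 18
    h = n * (w + 6) - (w + 3)
    short_switchbacks = 6 * (L - k + 9) * (n * (w + 6) // 2)
    long_glider = 3 * h - 1
    long_switchbacks = 5 * h
    return short_switchbacks + long_glider + long_switchbacks


for L in range(0, 20):
    for k in range(0, L + 1):
        for n in range(4, 25, 4):
            w = 6 * (L + 9) + 18
            h = n * (w + 6) - (w + 3)
            t = module_e_length(L, k, n)
            # Closed form given in the text.
            assert t == 3 * n * (L - k + 9) * (w + 6) + 8 * h - 1
            assert t % 24 == 23
            c = (t + 1) // 4
            assert (t + 1) % 4 == 0 and c % 6 == 0 and (3 * c) % 18 == 0

# Bead budget stated in the text: two phases of short switchbacks,
# the long glider, and the five long switchbacks.
assert 2 * (4 * 12 + 4 * 16) + 18 + 5 * 6 == 272
```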


Module F is, as mentioned earlier, treacherous because, even if its structure is very simple,
it must allow G along its side in switchback mode and in glider mode (three times as slow).
This requires a sophisticated coloring of the beads along its side. As shown earlier,
$4 \times 12$ beads are enough using the exponential periodic coloring scheme. Add to that the 12
beads for the glider and we get a total of 60 beads. Its total length is $4h$ and its structure is
always the same: 4 beads wide and $h$ beads high (see Fig. 22).

Figure 15 Module C: End of word detected. Length = $3h - 10$.

Figure 16 Module D1: Letter One in switchback folding when it is the first letter. Length = $3n(w + 6)$.

Figure 17 Module D0: Letter Zero in switchback folding when $n = 4$. Length = $3n(w + 6)$.

Figure 18 Module D0: Letter Zero in switchback folding when $n \geq 8$. Length = $3n(w + 6)$.


Module G is the brain of the construction: it accumulates a large number of functions: Read,
Copy (both forward and backward), and Line Feed. Using our chaussettes system, we were
able to create an offset between all these functions and place them at different locations in
the module. Its full description in these various situations may be found in Fig. 23, 24, 25,
26 and 27. Each black part but the first one requires 12 beads to be implemented and there
are 7 of them, thus a total of 84 beads. Each red part requires its own beads, thus a total of
$3 \times 14 + 18 + 15 + 29 + 6 = 110$ beads. The first black part consists of 48 unique beads (even
if we can save much more here), followed by a glider colored exponentially with 60 beads to
match the coloring of F on its sides. It follows that implementing module G requires at
most 254 different beads only! Its total length is $6h - 1$ and it consists of either 6 switchbacks
of height $h$, or a glider $h + 3 = n(w + 6) - w$ wide (i.e. the width of $n - 1$ production
modules folded upright plus an $n$th production module without its G).


F Omitted proofs for the Rule design 
F.1 NP-completeness


Proof of Lemma 5. We reduce from 3-SAT with $n$ variables and $m$ clauses to the rule design
problem with $n + m$ different conformations to be uniquely folded simultaneously. Let
$x_0, x_1, \ldots, x_{n-1}$ be the variables, and $F_0, F_1, \ldots, F_{m-1}$ be the clauses of a 3-SAT formula.

We will encode all $2n$ possible literals by distinct bead types in seeds, one for each possible
literal. Figure 28 shows the encoding of a clause of the form $l_i \vee l_j \vee l_k$, where $l_i$, $l_j$ and $l_k$
are literals.

Then, if a rule folds all conformations obtained by our reduction correctly, we will set
$x_i$ = true whenever literal $x_i$ is attracted to any bead type in the rule, and $x_i$ = false otherwise.

Figure 20 Module E(r, L): Padding in compact switchback folding. Length = $3n(w + 6)(L - r + 9) + 8h - 1$.
Part A: repeated $L - r + 9$ times, length $3(L - r + 9)n(w + 6)$. Part B: width 3, length $3h - 1$.
Part C: width 5, length $5h$.

Figure 21 Module E(r, L): Carriage return in glider and fold-back folding. Length = $3n(w + 6)(L - r + 9) + 8h - 1$,
height = 3, width = $[3n(w + 6)(L - r + 9) + 8h]/4$. If first: $q = 1, 3, 5, \ldots, 6(L - r + 9) - 1$, at $q \times n(w + 6)/2$.
If not first: $q = 1, 2, \ldots, 3(L - r + 9) - 1$, at $q \times n(w + 6)$.

Figure 22 Module F: Term. Length = $4h$, width = 4.

Figure 23 Module G: Reading a Zero in glider folding. Length = $6h - 1$.

Figure 25 Module G: Copying a Zero in switchback folding. Length = $6h - 1$, width = 6.

Figure 26 Module G: Copying a One in switchback folding. Length = $6h - 1$, width = 6.

Figure 27 Module G: Feeding a new line in glider folding.

Figure 28 Encoding of a clause by a target seed (in blue and orange) and conformation (in
purple): if there is at least one attraction between the orange bead and the seed, exactly this
conformation is produced. Otherwise, other conformations (not in the targets) are producible.

However, we need to enforce that for all $i$, $x_i$ and $\neg x_i$ are not both attracted to other
beads.³ We add another $n$ targets to make sure that $x_i$ and $\neg x_i$ are not both chosen by
the rule: in the target conformation shown in Figure 29, the first bead produced has two
neighboring beads from the seed. If the first bead were attracted to both $x_i$ and $\neg x_i$, another
conformation, not in the targets, would be producible, with the first bead next to $x_i$ and $\neg x_i$.



Figure 29 This set of target seed (in blue and orange) and conformation (the purple bead) makes
sure that $x_i$ and $\neg x_i$ are not both picked by the rule at the same time.

Finally, this proof works for delay time 1. Extending it to a larger delay time $\delta > 1$ can be
done by adding a bead in the seed, $\delta$ points away from the first bead produced, for each of
the seeds. Then, add a straight line of length $\delta - 1$ to that point in the target conformation.

◄ 


F.2 An FPT algorithm 

Theorem 2. The rule design problem with $n$ target conformations, each of length $l$, and
delay time $\delta$ is NP-complete but fixed-parameter tractable, as it can be solved in time and
space complexity $l\,2^{9n^2(\delta+1)^4}$.

Proof. The FPT algorithm solves reachability in a graph of partial rules: for each
$i \in \{0, 1, \ldots, l - \delta\}$, let $B_i$ be the set of all bead types that beads $i, i+1, \ldots, i+\delta$ can be adjacent
to in all target conformations, assuming all beads from 0 to $i$ (inclusive) are placed correctly
in each of the target conformations.

Then, let $\mathcal{R}_i$ be the set of all possible symmetric relations on $\{i, i+1, \ldots, i+\delta\} \cup B_i$. We
say that two binary relations $R \in \mathcal{R}_i$ and $T \in \mathcal{R}_{i+1}$ are compatible if $R \cap I = T \cap I$, where
$I = (\{i, i+1, \ldots, i+\delta+1\} \cup B_i \cup B_{i+1})^2$.
In other words, $R$ and $T$ are compatible if and only if they are equal on their restriction
to their common beads in the primary structure.

We then solve reachability from all rules of $\mathcal{R}_0$ to any rule of $\mathcal{R}_{l-\delta}$, in the graph whose
set of vertices is $V = \bigcup_{i=0}^{l-\delta} \mathcal{R}_i$ and whose edge set $E$ is the compatibility relation on $V$.

Now, in each target conformation, there are at most $3(\delta + 1)^2$ beads in a radius $\delta$, including
beads of the primary structure. Therefore, for all $i$, $|\{i, i+1, \ldots, i+\delta\} \cup B_i| \leq 3n(\delta + 1)^2$,
and hence $|\mathcal{R}_i| \leq 2^{9n^2(\delta+1)^4}$. This means that we are solving reachability in a graph of size
at most $l\,2^{9n^2(\delta+1)^4}$, hence our result.⁴ ◄

³ If neither is selected, setting $x_i$ to either true or false does not change the result.

⁴ This space complexity might not make the algorithm usable on actual machines, but by generating the
$\mathcal{R}_i$ lazily, we were able to compute rules for the counter, and some modules of the Turing simulation.
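The layered reachability described in this proof can be sketched as follows. This is a toy illustration under several assumptions, not the implementation used for the constructions: the ground sets are passed in directly instead of being derived from target conformations, the predicate `is_valid` is a hypothetical stand-in for "this partial rule folds the current bead correctly in every target", and compatibility is taken to mean agreement on pairs of elements common to two consecutive ground sets, following the "in other words" reading above.

```python
from itertools import combinations_with_replacement


def symmetric_relations(ground):
    """Yield every symmetric relation on `ground` as a frozenset of unordered pairs."""
    pairs = [frozenset(p) for p in combinations_with_replacement(list(ground), 2)]
    for mask in range(1 << len(pairs)):
        yield frozenset(p for bit, p in enumerate(pairs) if (mask >> bit) & 1)


def restrict(relation, elements):
    """Keep only the pairs whose endpoints all belong to `elements`."""
    return frozenset(p for p in relation if p <= elements)


def find_rule_path(grounds, is_valid):
    """Return one compatible sequence of partial rules, one per layer, or None.

    grounds[i] plays the role of {i, ..., i + delta} union B_i;
    is_valid(i, relation) is a hypothetical correctness filter for layer i.
    """
    # frontier maps a reachable relation of the current layer to one path leading to it.
    frontier = {r: (r,) for r in symmetric_relations(grounds[0]) if is_valid(0, r)}
    for i in range(1, len(grounds)):
        common = frozenset(grounds[i - 1]) & frozenset(grounds[i])
        # Index the previous layer by its restriction to the common elements.
        by_restriction = {}
        for r, path in frontier.items():
            by_restriction.setdefault(restrict(r, common), path)
        frontier = {}
        for t in symmetric_relations(grounds[i]):
            if is_valid(i, t):
                path = by_restriction.get(restrict(t, common))
                if path is not None:
                    frontier[t] = path + (t,)
        if not frontier:
            return None
    return next(iter(frontier.values()), None)


if __name__ == "__main__":
    # Toy instance: layer i must make beads i and i + 1 attract each other.
    grounds = [{0, 1}, {1, 2}, {2, 3}]
    path = find_rule_path(grounds, lambda i, rel: frozenset({i, i + 1}) in rel)
    print(path)
```

Because `symmetric_relations` is a generator, each layer is enumerated lazily, in the spirit of the lazy generation mentioned in the footnote; the enumeration is still exponential in the size of each ground set, so this only runs on very small instances.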



